7 research outputs found

    AGEWEB : les agents personnels d'aide à la recherche documentaire sur le Web

    A Novel Hybrid Classification Approach for Sentiment Analysis of Text Document

    Sentiment analysis is a highly active research area in natural language processing. It assigns a negative or positive polarity to one or more entities using various natural language processing tools, and the sentiment classifiers built on them have shown both high and low performance. Our approach focuses on analyzing the sentiment expressed in product reviews using original text search techniques. These reviews can be classified as positive or negative based on certain aspects in relation to a term-based query. In this paper, we use two machine learning methods for classification, Support Vector Machines (SVM) and Random Forest, and we introduce a novel hybrid approach to classify product reviews from Amazon. This is useful for consumers who want to research the sentiment around products before purchase, or for companies that want to monitor public sentiment about their brands. The results show that the proposed method outperforms the individual classifiers on this Amazon dataset.

    Testing Sphinx’s language model fault-tolerance for the Holy Quran

    The Carnegie Mellon University (CMU) Sphinx framework is increasingly used for Arabic speech recognition in general and for the Holy Quran in particular. Generating the language model involves the tedious task of preparing transcriptions for all the data. In this paper, we investigate the fault tolerance of the automatically generated language model by comparing corrected and uncorrected transcriptions, with and without silence tagging. This editing addresses the repetitions and pauses encountered during recitations. Experiments show that the average difference between the lowest and highest Word Error Rate (WER) for each configuration of the number of senones is 0.6% when using all files for training and 1.6% when using 80% of the files for training the language model of 17 chapters of the Holy Quran. Results show that the performance of models trained without any correction can be close to that obtained when all required corrections to the transcriptions are performed.

    Towards an accurate speaker-independent Holy Quran acoustic model

    The popularity of speech recognition tools keeps increasing with the processing power of mobile devices. The use of speech recognition for Arabic in general, and for the Holy Quran in particular, has followed the same trend. Holy Quran speech recognition systems have been developed by increasing the training data. In this paper, a more accurate Carnegie Mellon University Sphinx acoustic model was trained for Holy Quran chapters 001 and 067 to 114. When more effort was put into producing more accurate training data, the Word Error Rate of the trained acoustic model reached around 15%.

    Building CMU Sphinx language model for the Ho

    This paper investigates the use of a simplified set of Arabic phonemes in an Arabic speech recognition system applied to the Holy Quran. CMU Sphinx 4 was used to train and evaluate a language model for the Hafs narration of the Holy Quran. The language model was built using a simplified list of Arabic phonemes instead of the commonly used Romanized set, in order to simplify the process of generating the language model. The experiments yielded a very low Word Error Rate (WER) of 1.5% with a very small set of audio files when all the audio data was used for both the training and the testing phases. However, when using 90% and 80% of the training data, the WER obtained was 50.0% and 55.7%, respectively.

    Building CMU Sphinx language model for the Holy Quran using simplified Arabic phonemes

    This paper investigates the use of a simplified set of Arabic phonemes in an Arabic speech recognition system applied to the Holy Quran. CMU Sphinx 4 was used to train and evaluate a language model for the Hafs narration of the Holy Quran. The language model was built using a simplified list of Arabic phonemes instead of the commonly used Romanized set, in order to simplify the process of generating the language model. The experiments yielded a very low Word Error Rate (WER) of 1.5% with a very small set of audio files when all the audio data was used for both the training and the testing phases. However, when using 90% and 80% of the training data, the WER obtained was 50.0% and 55.7%, respectively.